 particular value


On Probability versus Likelihood. A discussion about two terms that are…

#artificialintelligence

From the perspective of machine learning and data science, probabilities and likelihoods are used to quantify uncertainty, or perhaps how probable it is that an observation belongs to one class or another. They crop up when looking at confusion matrices, and indeed algorithms like Naive Bayes classification are pretty much probabilistic models. The reality is that data scientists cannot escape these concepts. In everyday language, though, we tend to use the terms probability and likelihood almost interchangeably. Indeed, it's not uncommon to hear things like 'how likely is it to rain today?' or 'what are the chances of this or that happening?'
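The excerpt stops before it separates the two terms, so the following is only a minimal sketch of the usual statistical distinction, not the article's own example. It assumes a binomial model with 10 trials; the names `binomial_pmf`, `p_fixed`, and `k_observed` are illustrative choices. Probability holds the parameter fixed and varies the outcome, while likelihood holds the observed outcome fixed and varies the parameter.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(K = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 10  # number of trials (assumed for illustration)

# Probability: the parameter p is fixed and the outcome k varies.
p_fixed = 0.5
probabilities = [binomial_pmf(k, n, p_fixed) for k in range(n + 1)]
total_probability = sum(probabilities)  # probabilities over all outcomes sum to 1

# Likelihood: the observed outcome k is fixed and the parameter p varies.
k_observed = 7
candidate_ps = [i / 100 for i in range(1, 100)]  # grid 0.01, 0.02, ..., 0.99
likelihoods = [binomial_pmf(k_observed, n, p) for p in candidate_ps]
best_p = candidate_ps[likelihoods.index(max(likelihoods))]

print(f"sum of probabilities over k: {total_probability:.3f}")
print(f"p with the highest likelihood for k = 7: {best_p:.2f}")
```

The same formula is evaluated in both cases; what changes is which argument is treated as known. The likelihood values over the grid of `p` are not a probability distribution and need not sum to 1.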


Machine Learning - Visualized

#artificialintelligence

In the traditional hard-coded approach, we program a computer to perform a certain task. We tell it exactly what to do when it receives a certain input. In mathematical terms, this is like saying that we write the function f(x) ourselves, so that when users feed it the input x, it gives the correct output y. In machine learning, however, we have a large set of inputs x and corresponding outputs y but not the function f(x). The goal here is to find the f(x) that transforms the input x into the output y.
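The contrast between writing f(x) by hand and finding it from data can be sketched in a few lines. This is an illustrative toy, assuming the unknown relationship is a straight line and using ordinary least squares as the learning procedure; the names `hard_coded_f`, `fit_line`, and `learned_f` are hypothetical and not taken from the article.

```python
def hard_coded_f(x):
    """Traditional approach: the programmer writes the rule directly."""
    return 2 * x + 1


def fit_line(xs, ys):
    """Learn a slope and intercept from (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Machine learning setting: only inputs and outputs are available.
# Here the outputs are generated from the hard-coded rule so the result can be checked.
xs = [0, 1, 2, 3, 4]
ys = [hard_coded_f(x) for x in xs]

slope, intercept = fit_line(xs, ys)


def learned_f(x):
    """The function recovered from data rather than written by hand."""
    return slope * x + intercept


print(learned_f(10), hard_coded_f(10))
```

On this noise-free toy data the learned function matches the hard-coded one; with real data the fit would only approximate the underlying relationship.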


Improving performance of random forests for a particular value of outcome by adding chosen features

#artificialintelligence

Choosing features to improve the performance of a particular algorithm is a difficult problem. Currently there is PCA, which is hard to understand (although it can be used out of the box), is not easy to interpret, and requires centering and scaling of the features. In addition, it does not make it possible to improve prediction performance for a particular outcome (for example, when that outcome's accuracy is lower than for the others or when it is of particular importance). My method allows features to be used without preprocessing, so the resulting prediction is easy to explain.
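The excerpt does not describe the method itself, so the sketch below covers only the preliminary step it implies: finding which outcome value is predicted worst. It assumes "accuracy for a particular outcome" means the fraction of samples with that true value that were predicted correctly. The function name `per_outcome_accuracy` and the labels are made up for illustration and are not results from the article.

```python
from collections import Counter


def per_outcome_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples for each true outcome value."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {outcome: correct[outcome] / totals[outcome] for outcome in totals}


# Toy labels (hypothetical), standing in for a classifier's predictions.
y_true = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
y_pred = ["a", "a", "b", "b", "b", "c", "a", "c", "b"]

accuracy = per_outcome_accuracy(y_true, y_pred)
weakest = min(accuracy, key=accuracy.get)  # outcome with the lowest accuracy

for outcome, value in accuracy.items():
    print(f"{outcome}: {value:.2f}")
print(f"outcome to focus on: {weakest}")
```

An outcome flagged this way would be the candidate for the targeted feature selection the excerpt refers to.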

